PATTERN AVOIDANCE FOR RANDOM PERMUTATIONS


HARRY CRANE AND STEPHEN DESALVO 


Abstract. We use techniques from Poisson approximation to prove explicit error bounds 
on the number of permutations that avoid any pattern. Most generally, we bound the total 
variation distance between the joint distribution of pattern occurrences and a corresponding 
joint distribution of independent Bernoulli random variables, which as a corollary yields a 
Poisson approximation for the distribution of the number of occurrences. We also investigate 
occurrences of consecutive patterns in random Mallows permutations, of which uniform 
random permutations are a special case. These bounds allow us to estimate the probability 
that a pattern occurs any number of times and, in particular, the probability that a random 
permutation avoids a given pattern. 
1. Introduction

A permutation of $[n] = \{1,\ldots,n\}$ is a bijection $\sigma : [n] \to [n]$, $i \mapsto \sigma(i) = \sigma_i$, written
$\sigma = \sigma_1 \cdots \sigma_n$. For each $n = 1, 2, \ldots$ we write $S_n$ to denote the set of permutations of $[n]$.

Given permutations $\sigma = \sigma_1 \cdots \sigma_n$ and $\tau = \tau_1 \cdots \tau_m$, we say that $\sigma$ avoids $\tau$ if there does not
exist a subsequence $1 \le i_1 < \cdots < i_m \le n$ such that $\sigma_{i_1} \cdots \sigma_{i_m}$ is order-isomorphic to $\tau$, and
we say that $\sigma$ avoids $\tau$ consecutively if there is no $j = 1,\ldots,n-m+1$ such that $\sigma_j \sigma_{j+1} \cdots \sigma_{j+m-1}$
is order-isomorphic to $\tau$. Here we study pattern avoidance probabilities for a wide class
of random permutations from the Mallows distribution, which is of particular interest in
the fields of statistics and probability, and whose special cases provide insights into questions in
enumerative and extremal combinatorics.

An inversion in $\sigma = \sigma_1 \cdots \sigma_n$ is a pair $(i, j)$, $i < j$, such that $\sigma_i > \sigma_j$. For example, $\sigma = 34125$
has four inversions, (1,3), (1,4), (2,3), (2,4). We write $\mathrm{inv}(\sigma)$ to denote the set of inversions
of $\sigma$. With $\Sigma_n$ denoting a random permutation of $[n]$, the Mallows distribution with parameter
$q \in (0, \infty)$ on $S_n$ assigns probability

(1) $P\{\Sigma_n = \sigma\} = P_n^q(\sigma) = q^{|\mathrm{inv}(\sigma)|} / I_n(q)$, $\sigma \in S_n$,

where $I_n(q) = \prod_{j=1}^{n} \sum_{i=0}^{j-1} q^i$ is the inversion polynomial and $|\mathrm{inv}(\sigma)|$ is the number of inversions
in $\sigma$. Note that $q = 1$ corresponds to the uniform distribution on $S_n$, i.e., $P\{\Sigma_n = \sigma\} = 1/n!$
for all $\sigma \in S_n$, and is the critical point at which the Mallows family switches from penalizing
inversions, $q < 1$, to favoring them, $q > 1$. Results for pattern avoidance probabilities under
the uniform distribution translate to enumerative results for the number of pattern avoiding
permutations. Combinatorial inequalities bounding the number of pattern avoiding
permutations obtained from our techniques are contained in Equations (11) and (14).
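
As a quick check of the definitions above, the following Python sketch lists the inversions of 34125 and evaluates the Mallows probability (1) by brute force. The function names are illustrative choices rather than notation from the paper, and the sketch assumes the inversion polynomial is $I_n(q) = \prod_{j=1}^{n} (1 + q + \cdots + q^{j-1})$.

```python
from itertools import permutations


def inversions(sigma):
    """Return the inversions of sigma as 1-indexed position pairs (i, j), i < j."""
    n = len(sigma)
    return [(i + 1, j + 1)
            for i in range(n)
            for j in range(i + 1, n)
            if sigma[i] > sigma[j]]


def inversion_polynomial(n, q):
    """I_n(q) = prod_{j=1}^{n} (1 + q + ... + q^(j-1)) (assumed form)."""
    value = 1.0
    for j in range(1, n + 1):
        value *= sum(q ** i for i in range(j))
    return value


def mallows_pmf(sigma, q):
    """Mallows probability q^|inv(sigma)| / I_n(q) of a single permutation."""
    return q ** len(inversions(sigma)) / inversion_polynomial(len(sigma), q)


print(inversions((3, 4, 1, 2, 5)))  # [(1, 3), (1, 4), (2, 3), (2, 4)]

# The probabilities over S_4 sum to one, and q = 1 gives the uniform law 1/n!.
total = sum(mallows_pmf(p, 0.7) for p in permutations(range(1, 5)))
print(abs(total - 1.0) < 1e-12, mallows_pmf((2, 1, 3), 1.0))
```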

The Mallows distribution [28] was introduced as a one-parameter model for rankings that
occur in statistical analysis. More recently, Mallows permutations have been studied in the
context of the longest increasing subsequence problem [6] and quasi-exchangeable random
sequences Ifl^l20l. Our study of pattern avoidance for this class relates to recent work in
the combinatorics literature, which considers the Wilf equivalence classes of the inversion
polynomials for permutations that avoid certain sets of patterns [10, 16]. For general values
of $q > 0$, we consider the problem of consecutive pattern avoidance for random Mallows
permutations, but our main theorems go further by establishing explicit error
bounds on the entire distribution of the number of occurrences of patterns in a random
permutation. In the uniform setting ($q = 1$), our method provides an estimate for the
number of permutations with a prescribed number of occurrences as well as the number of
permutations with a prescribed number of consecutive occurrences of a given pattern. Our
main theorems, therefore, complement prior work by Elizalde & Noy [15], Perarnau [32],
and the more recent work by the current authors & Elizalde [14] on consecutive pattern
avoidance, as well as Nakamura IMl, who used functional equations to enumerate sets with
a prescribed number of occurrences of a given pattern. Other related works include Il3^ for
patterns $\tau$ of length $n - 1$, and IlMl for patterns $\tau$ of length $n - 2$. It is left as an open
problem in [38] to perform the same analysis for patterns $\tau$ of length $n - 3$.

Our approach also differs from previous work in a few key respects. While most prior 
work seeks either exact or asymptotic enumeration of the sets that avoid a given pattern 
or collection of patterns, we use the Chen-Stein Poisson approximation method BT^ . in 
particular El, to bound the total variation distance between the collection of all dependent 
indicator random variables indicating pattern occurrence at a prescribed set of indices, 
and a joint distribution of independent Bernoulli random variables with the same marginal 
distributions. From these bounds, we can approximate the probability that a random 
permutation avoids a given pattern, i.e., the pattern occurs zero times, or contains any 
number of occurrences of that pattern. 

We summarize our main theorems in Section 3.


2. Motivation 

Restricted permutations fall into two broad classes. The first, more tractable type is of
the form $\sigma(a) \neq b$ for $a, b \in [n]$, whose study dates to the classical problème des rencontres
in the early 1700s IIT^; see also O Chapter 4]. A special case counts the number $D_n$ of
derangements of $[n]$, i.e., permutations of $[n]$ without fixed points, yielding the asymptotic
expression

(2) $D_n = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!} \sim n!/e$ as $n \to \infty$.

Equation (2) can be stated in probabilistic terms by letting $\Sigma_n$ be a uniform random
permutation of $[n]$, i.e., $P\{\Sigma_n = \sigma\} = 1/n!$ for each $\sigma \in S_n$, in which case

(3) $P\{\Sigma_n \text{ is a derangement}\} = D_n/n! \sim 1/e$ as $n \to \infty$.

See for more thorough treatments involving the cycle structure of random permutations.

We can also derive the expression in (3) by Poisson approximation. With $W$ denoting
the number of fixed points in a random permutation of $[n]$, we demonstrate in Section 4.2,
see also [5, Chapter 4], that the distribution of $W$ converges in total variation distance to
the distribution of an independent Poisson random variable with expected value 1. In
addition to the asymptotic value for the probability that a random permutation has no fixed
points, this approach bounds the absolute error of probabilities that involve any measurable
function of the number of fixed points in a random permutation.
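
The convergence described above can be observed numerically for small $n$. The sketch below enumerates all permutations of $[n]$, tabulates the exact law of the number of fixed points $W$, and computes its total variation distance to a Poisson random variable with mean 1. The helper names `fixed_point_law` and `tv_to_poisson1` are hypothetical, and brute-force enumeration is only practical for small $n$.

```python
from itertools import permutations
from math import exp, factorial


def fixed_point_law(n):
    """Exact law of W = number of fixed points of a uniform permutation of [n]."""
    counts = [0] * (n + 1)
    for perm in permutations(range(n)):
        counts[sum(1 for i, v in enumerate(perm) if i == v)] += 1
    return [c / factorial(n) for c in counts]


def tv_to_poisson1(n):
    """Total variation distance between the law of W and Poisson(1)."""
    law = fixed_point_law(n)
    poisson = [exp(-1.0) / factorial(k) for k in range(n + 1)]
    # Poisson mass on {n+1, n+2, ...}, where P(W = k) = 0.
    tail = 1.0 - sum(poisson)
    return 0.5 * (sum(abs(a - b) for a, b in zip(law, poisson)) + tail)


for n in range(2, 8):
    # Columns: n, P(W = 0) = D_n / n!, distance to Poisson(1).
    print(n, fixed_point_law(n)[0], tv_to_poisson1(n))
```

The second column approaches $1/e$ and the third column shrinks as $n$ grows.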

The second type of restriction is pattern avoidance, which attracts increasing attention in
the modern probability [6, 21, 22] and modern combinatorics literature IZl. Any sequence
of distinct positive integers $w = w_1 \cdots w_k$ determines a permutation of $[k]$ by reduction: with
$\{w_{(1)}, \ldots, w_{(k)}\}_<$ denoting the set of elements listed in increasing order, we define the map
$w_{(i)} \mapsto i$, under which $w$ maps to a permutation $\mathrm{red}(w)$ of $[k]$, called the reduction of $w$. For
example, $w = 826315$ reduces to $\mathrm{red}(w) = 625314$. We call any fixed $\tau \in S_m$ a pattern and
say that $\sigma \in S_n$ contains $\tau$ if there exists a subsequence $1 \le i_1 < \cdots < i_m \le n$ such that
$\mathrm{red}(\sigma_{i_1} \cdots \sigma_{i_m}) = \tau$. We say $\sigma \in S_n$ avoids $\tau$ if it does not contain it. We say that $\sigma$ contains
$\tau$ consecutively if there exists an index $j \in [n - m + 1]$ such that $\mathrm{red}(\sigma_j \sigma_{j+1} \cdots \sigma_{j+m-1}) = \tau$;
otherwise, we say $\sigma$ avoids $\tau$ consecutively. For any pattern $\tau$, we define

    $S_n(\tau) := \{\sigma \in S_n : \sigma \text{ avoids } \tau\}$ and
(4) $\bar{S}_n(\tau) := \{\sigma \in S_n : \sigma \text{ avoids } \tau \text{ consecutively}\}$,

which we extend to any subset $A \subset \bigcup_{m \ge 1} S_m$ by

(5) $S_n(A) := \{\sigma \in S_n : \sigma \text{ avoids all } \tau \in A\}$ and
    $\bar{S}_n(A) := \{\sigma \in S_n : \sigma \text{ avoids all } \tau \in A \text{ consecutively}\}$.

Much effort has been devoted to exact enumeration of $S_n(A)$ for certain choices of $A$; see, e.g.,
IIUIlllZl|25l. For the most part, we are interested in sets $S_n(\tau)$ containing all permutations
that avoid a given pattern $\tau$, though our approach extends in a straightforward manner to
more general sets $A$.
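
The reduction map and the two notions of containment translate directly into code. The following is a minimal sketch; the names `red`, `contains`, and `contains_consecutively` are chosen here for illustration, and permutations are represented as tuples of distinct integers.

```python
from itertools import combinations


def red(w):
    """Reduction of w: replace its i-th smallest entry by i."""
    rank = {v: i + 1 for i, v in enumerate(sorted(w))}
    return tuple(rank[v] for v in w)


def contains(sigma, tau):
    """True if some subsequence of sigma reduces to tau (classical containment)."""
    tau = tuple(tau)
    return any(red([sigma[i] for i in idx]) == tau
               for idx in combinations(range(len(sigma)), len(tau)))


def contains_consecutively(sigma, tau):
    """True if some window of consecutive entries of sigma reduces to tau."""
    tau, m = tuple(tau), len(tau)
    return any(red(sigma[j:j + m]) == tau for j in range(len(sigma) - m + 1))


print(red((8, 2, 6, 3, 1, 5)))  # (6, 2, 5, 3, 1, 4)

# 1324 contains 123 (via the entries 1, 3, 4) but not consecutively.
print(contains((1, 3, 2, 4), (1, 2, 3)), contains_consecutively((1, 3, 2, 4), (1, 2, 3)))
```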

Enumerating $S_n(\tau)$ is notoriously difficult for patterns of fixed length larger
than 3. Knuth [26] initiated interest in pattern avoidance in the study of algorithms by
identifying the 231-avoiding permutations as exactly those that can be sorted by a single
run through a stack; see Bóna [Zl Chapter 8] for further discussion. In fact, it is now well
known that the avoidance sets $S_n(\tau)$ for every length-3 pattern $\tau$ are enumerated by the
Catalan numbers [37, A000108]:

$|S_n(\tau)| = \binom{2n}{n} / (n + 1)$, $\tau \in \{123, 132, 213, 231, 312, 321\}$.
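
The Catalan enumeration can be confirmed by exhaustive search for small $n$. This sketch counts the avoiders of each length-3 pattern directly and compares the count with $\binom{2n}{n}/(n+1)$; the helper names are hypothetical and the search is exponential in $n$, so it is meant only as a sanity check.

```python
from itertools import combinations, permutations
from math import comb


def red(w):
    """Reduction of w: replace its i-th smallest entry by i."""
    rank = {v: i + 1 for i, v in enumerate(sorted(w))}
    return tuple(rank[v] for v in w)


def avoids(sigma, tau):
    """True if no subsequence of sigma reduces to tau."""
    return all(red([sigma[i] for i in idx]) != tau
               for idx in combinations(range(len(sigma)), len(tau)))


def count_avoiders(n, tau):
    """|S_n(tau)| by exhaustive enumeration of S_n."""
    return sum(1 for sigma in permutations(range(1, n + 1)) if avoids(sigma, tau))


def catalan(n):
    return comb(2 * n, n) // (n + 1)


for tau in permutations((1, 2, 3)):
    print(tau, [count_avoiders(n, tau) for n in range(1, 7)])
print([catalan(n) for n in range(1, 7)])
```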
Just as in the derangement problem above, enumeration of $S_n(A)$ has an elementary
probabilistic interpretation that motivates much of our paper. With $\Sigma_n$ denoting a uniform
random permutation of $[n]$ and $A$ a set of permutations, the probability that $\Sigma_n$ avoids $A$ is

$P\{\Sigma_n \text{ avoids every } \tau \in A\} = |S_n(A)| / n!$.

The Stanley-Wilf theorem [29] states that $|S_n(\tau)|$ grows at most exponentially with $n$ for every fixed
$\tau$. For example, the Catalan numbers are known to grow asymptotically like $4^n / (\sqrt{\pi}\, n^{3/2})$,
yielding the asymptotic avoidance probability

$P\{\Sigma_n \text{ avoids } 231\} = \binom{2n}{n} / (n + 1)! \sim \frac{1}{\sqrt{2}\, \pi n^2} \left( \frac{4e}{n} \right)^n$ as $n \to \infty$.

Such calculations quickly become intractable. For example, the sets of 1324-avoiding
permutations have only been enumerated up to $n = 31$ [23]. Even precise asymptotics for
$|S_n(1324)|$ have not yet been established lISlI^IT^.

3. Main Results 

3.1. Definitions. Throughout the paper, we write $\mathcal{L}(X)$ to denote the distribution, or law,
of a random variable $X$ and $\mathcal{L}(Y \mid X)$ to denote the conditional distribution of $Y$ given $X$.
For random variables $X$ and $Y$, we write $d_{TV}(\mathcal{L}(X), \mathcal{L}(Y))$ to denote the total variation
distance between the distributions of $X$ and $Y$, which in the special case of non-negative
integer-valued random variables can be computed as

$d_{TV}(\mathcal{L}(X), \mathcal{L}(Y)) = \frac{1}{2} \sum_{n=0}^{\infty} |P(X = n) - P(Y = n)|$.

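Define the set of all unordered, distinct $j$-tuples of elements from $[n]$ by

$\Gamma_j := \{\{i_1, \ldots, i_j\} : 1 \le i_1 < \cdots < i_j \le n\}$, $j \in [n]$.

For each $\alpha \in \Gamma_j$, let $D_\alpha$ be the set of all $j$-element subsets of $[n]$ that overlap with $\alpha$ in
at least one element, i.e., $D_\alpha := \{\beta \in \Gamma_j : \beta \cap \alpha \neq \emptyset\}$. For example, if $\alpha = \{1, 5, 8\}$, then
$D_\alpha = \{\{j_1, j_2, j_3\} : j_i \in \{1, 5, 8\} \text{ for some } i = 1, 2, 3\}$.

With $\Sigma_n$ denoting a uniform random permutation of $[n]$, $\alpha = \{i_1, \ldots, i_j\} \in \Gamma_j$, and $\tau$ a fixed pattern
of length $j$, we define $X_\alpha = X_{i_1, \ldots, i_j}$ as the indicator random variable for the event that the
reduction of $\Sigma_n$ at positions $i_1, \ldots, i_j$ forms the pattern $\tau$, i.e.,

(6) $X_\alpha := I(\mathrm{red}(\Sigma_n(i_1) \cdots \Sigma_n(i_j)) = \tau)$.

Let $X = X_j := (X_\alpha)_{\alpha \in \Gamma_j}$, and let $B = B_j = (B_\alpha)_{\alpha \in \Gamma_j}$ denote a joint distribution of independent
Bernoulli random variables with marginal distributions satisfying $E B_\alpha = E X_\alpha$ for all $\alpha \in \Gamma_j$.
The random variable

(7) $W := \sum_{\alpha \in \Gamma_j} X_\alpha$

counts the total number of occurrences of $\tau$ in $\Sigma_n$.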

For any $\tau \in S_m$, for each $s = 1, \ldots, m - 1$ we define $L_s(\tau)$ as the overlap of size $s$, i.e., the
number of permutations $\sigma \in S_{2m-s}$ such that there are indices $1 \le i_1 < \cdots < i_m \le 2m - s$ and
$1 \le j_1 < \cdots < j_m \le 2m - s$ such that $\{i_1, \ldots, i_m\}$ and $\{j_1, \ldots, j_m\}$ have exactly $s$ elements in
common and

$\mathrm{red}(\sigma_{i_1} \cdots \sigma_{i_m}) = \mathrm{red}(\sigma_{j_1} \cdots \sigma_{j_m}) = \tau$.

For consecutive pattern avoidance, we similarly define the set of all $j$-tuples of the form
$\{i, i + 1, \ldots, i + j - 1\}$, $1 \le i \le n - j + 1$, as

$\bar{\Gamma}_j := \{\{i, i + 1, \ldots, i + j - 1\} : 1 \le i \le n - j + 1\}$, $j \in [n]$.

Let $\bar{X} = \bar{X}_j := (X_\alpha)_{\alpha \in \bar{\Gamma}_j}$, and let $\bar{B} = \bar{B}_j = (B_\alpha)_{\alpha \in \bar{\Gamma}_j}$ denote a joint distribution of independent
Bernoulli random variables with marginal distributions which satisfy $E B_\alpha = E X_\alpha$ for all
$\alpha \in \bar{\Gamma}_j$. Next, we define analogously the random variable

(8) $\bar{W} := \sum_{1 \le s \le n-m+1} X_{s, s+1, \ldots, s+m-1}$,

where $X_{s, s+1, \ldots, s+m-1}$ is defined as in (6) for a fixed pattern $\tau \in S_m$ and a uniform random
permutation $\Sigma_n$. We also define $\bar{L}_s(\tau)$ as the sequential overlap of size $s$, i.e., the number of
permutations $\sigma \in S_{2m-s}$ such that

$\mathrm{red}(\sigma_1 \cdots \sigma_m) = \mathrm{red}(\sigma_{m-s+1} \cdots \sigma_{2m-s}) = \tau$.
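
For short patterns the sequential overlap can be computed by exhaustive search. The sketch below counts the permutations of $[2m - s]$ whose first $m$ entries and last $m$ entries both reduce to $\tau$; the function name `sequential_overlap` is a hypothetical choice, and the enumeration is feasible only for small $m$.

```python
from itertools import permutations


def red(w):
    """Reduction of w: replace its i-th smallest entry by i."""
    rank = {v: i + 1 for i, v in enumerate(sorted(w))}
    return tuple(rank[v] for v in w)


def sequential_overlap(tau, s):
    """Number of sigma in S_{2m-s} whose first and last m entries both reduce to tau."""
    tau, m = tuple(tau), len(tau)
    count = 0
    for sigma in permutations(range(1, 2 * m - s + 1)):
        # sigma[:m] is sigma_1 ... sigma_m; sigma[m - s:] is sigma_{m-s+1} ... sigma_{2m-s}.
        if red(sigma[:m]) == tau and red(sigma[m - s:]) == tau:
            count += 1
    return count


for tau in [(1, 2, 3), (1, 3, 2), (2, 3, 1)]:
    print(tau, [sequential_overlap(tau, s) for s in (1, 2)])
```

For the increasing pattern the count is 1 for every $s$, since two overlapping increasing windows force the whole segment to be increasing.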


3.2. Main corollaries. We begin with several limit theorems which follow from the quantitative
bounds given in Section 3.3.

Corollary 3.1. Fix any $\epsilon > 0$, and let $j = j(n)$ be some increasing, integer-valued function of $n$
such that $j \ge (e + \epsilon) \sqrt{n}$. For any sequence of patterns $\tau_n \in S_j$, and for any measurable function
$h : \{0,1\}^{\binom{n}{j}} \to \mathbb{R}$ and Borel set $A \subset \mathbb{R}$, as $n$ tends to infinity we have

$P(h(X) \in A) = P(h(B) \in A) + o(1)$.


We also have an analogous theorem for consecutive patterns, but with a stronger result. 

Corollary 3.2. Fix any t> 0, and define M{t,n) := [ iogiog(n/!Hog!ogiog(n/f) “ l\- ^etm = m{n) he 
some increasing, integer-valued function of $n$ such that $m \ge M(t, n)$. For any sequence of patterns
$\tau_n \in S_m$, and for any measurable function $h : \{0,1\}^{n-m+1} \to \mathbb{R}$ and Borel set $A \subset \mathbb{R}$, as $n$ tends to
infinity we have

$P(h(\bar{X}) \in A) = P(h(\bar{B}) \in A) + o(1)$.

In particular, denoting by $Y_t$ an independent Poisson random variable with parameter $t$, and taking
$m(n) = M(t, n)$, we have

djy{W,Yt)=o(-\, 

\ml 

whence

$|\bar{S}_n(\tau_n)| \sim n!\, e^{-t}$ as $n \to \infty$.


Finally, we present an analogous result for permutations chosen according to the
Mallows($q$) distribution.

Corollary 3.3. Let $m = m(n)$ be a non-decreasing integer-valued sequence, $\tau_n \in S_m$ be a sequence of
patterns, and $q = q(n)$ be a sequence of parameters. For each $n \ge 1$, let $\Sigma_n$ be a random permutation
from the Mallows distribution (1) with parameter $q(n)$, with $\bar{X}_q$ and $\bar{B}_q$ defined analogously. For
any measurable function $h : \{0,1\}^{n-m+1} \to \mathbb{R}$ and Borel set $A \subset \mathbb{R}$, we have

$P(h(\bar{X}_q) \in A) = P(h(\bar{B}_q) \in A) + o(1)$,
provided either

q{n) < for almost all n>l, 

q{n) > 2 for almost all n>l, 

I inv{zm{n))\ ^ “ log(n)/ log((^(n)) for almost alln>l and q <1 or 
I inv{Tm{n))\ ^ “ ^og{n) / \og{q{n)) + m{n)^l2 for almost alln>l and q > 1. 

3.3. Quantitative bounds for large patterns. Corollaries 3.1 and 3.2 provide an asymptotic
analysis for sequences of patterns which also grow in size. It is too much to expect a
general asymptotic formula for any fixed pattern (we have already noted the difficulty of
nailing down the asymptotic growth of 1324-avoiding sets), but Poisson approximation,
see Section 4, provides a general approach for obtaining quantitative bounds on various
quantities when all sizes are fixed.

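Theorem 3.4. Assume $n \ge j \ge 3$, and $\tau$ is any pattern of length $j$. Let $X = (X_\alpha)_{\alpha \in \Gamma_j}$, and let
$B = (B_\alpha)_{\alpha \in \Gamma_j}$ denote an independent Bernoulli process with marginal distributions which satisfy
$E B_\alpha = E X_\alpha$ for all $\alpha \in \Gamma_j$. We have

(9) $d_{TV}(\mathcal{L}(X), \mathcal{L}(B)) \le 4 D_{n,j} + \frac{2\lambda}{j!}$,

where $D_{n,j} = \min(1, \lambda^{-1})(d_1 + d_2)$,

(10) $\lambda = \binom{n}{j} \frac{1}{j!}$, $d_1 = \binom{n}{j} \left( \binom{n}{j} - \binom{n-j}{j} \right) \frac{1}{(j!)^2}$, $d_2 = \sum_{s=1}^{j-1} 2 L_s(\tau) \binom{n}{2j-s} \frac{1}{(2j-s)!}$.

Furthermore, for $Y$ a Poisson random variable with mean $\lambda = E W$, we have

$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2 D_{n,j}$,

and also

(11) $n!\, e^{-\lambda} (1 - e^{\lambda} D_{n,j}) \le |S_n(\tau)| \le n!\, e^{-\lambda} (1 + e^{\lambda} D_{n,j})$.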


Remark 3.5. There are several noteworthy aspects to Theorem 3.4.

(1) The expression for $\lambda$ is the same for all patterns of length $j$.

(2) The expression for $d_1$ cannot be improved by our approach.

(3) We are unaware of any efficient means to calculate the values $L_s(\tau)$ in general. For a simple
and explicit upper bound, applicable for all patterns $\tau$ of length $j$, we suggest

( 12 ) 


(^2 ^ 


/-I 




but we suspect this bound can be improved.

(4) As Theorem 3.4 suggests, these bounds are most useful when $D_{n,j} < 1$, i.e., when $j$ is large.

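Theorem 3.6. Assume $n \ge m \ge 3$ and $\tau$ is any pattern of length $m$. We have

(13) $d_{TV}(\mathcal{L}(\bar{X}), \mathcal{L}(\bar{B})) \le 4 \bar{D}_{n,m} + \frac{2\lambda}{m!}$,

where $\bar{D}_{n,m} = \min(1, 1/\lambda)(d_1 + d_2)$,

$\lambda = \frac{n - m}{m!}$, $d_1 = \frac{n_1}{(m!)^2}$, $d_2 = \sum_{s=1}^{m-1} \frac{(n - 2m + s)\, 2 \bar{L}_s(\tau)}{(2m - s)!}$,

$n_1 = 2mn - 3m^2 + m$.

Let $Y$ denote a Poisson random variable with parameter $\lambda = E \bar{W}$. We have

$d_{TV}(\mathcal{L}(\bar{W}), \mathcal{L}(Y)) \le 2 \bar{D}_{n,m}$,

and also

(14) $n!\, e^{-\lambda} (1 - e^{\lambda} \bar{D}_{n,m}) \le |\bar{S}_n(\tau)| \le n!\, e^{-\lambda} (1 + e^{\lambda} \bar{D}_{n,m})$.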

3.4. Limitations of Poisson Approximation. It is tempting to conjecture that Corollary 3.1
holds even when $\lambda$ tends to some fixed positive constant, but we suspect this is not the case,
for reasons we now explain. First, we note the following.

Lemma 3.7 ([30]). Fix any $t > 0$ and let $\lambda = \binom{n}{j} / j!$. Then $\lambda \to t$ for

(15) $j \sim e \sqrt{n} - \frac{1}{4} \log(n) - \frac{1}{2} \log(2 \pi t) - \frac{e^2}{4} - \frac{1}{2}$ as $n \to \infty$.

Lemma 3.8. Suppose n, j and n - j tend to infinity, then we have 

di ~ A^ (1 - . 

In particular, for $j \sim e \sqrt{n} - \frac{1}{4} \log(n)$, we have $d_1 \to c \in (0, \infty)$.


Thus, a necessary condition for di to tend to zero is 


(16) 


•y/n- 


^)log(n). 


It is also well known, see Il^l27l, that the typical size of the longest increasing subsequence
is asymptotically of order $2 \sqrt{n}$, and so one cannot have a Poisson limit theorem which
applies to the increasing pattern $12 \cdots j$. It would be interesting to investigate the behavior
in the gap, i.e., for $j \sim c \sqrt{n}$ with any $2 < c < e$.
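
The role of the threshold $e \sqrt{n}$ in Lemma 3.7 can be illustrated numerically. The sketch below extends $\lambda = \binom{n}{j}/j!$ to real $j$ through the log-gamma function and solves $\lambda(j) = t$ by bisection. The function names and the bracketing interval $[2\sqrt{n}, 4\sqrt{n}]$ are assumptions made for this illustration, not part of the paper.

```python
from math import e, lgamma, log, sqrt


def log_lambda(n, j):
    """log of lambda = C(n, j) / j!, extended to real j via the gamma function."""
    return lgamma(n + 1) - lgamma(n - j + 1) - 2.0 * lgamma(j + 1)


def critical_length(n, t=1.0):
    """Solve lambda(j) = t for real j by bisection on [2 sqrt(n), 4 sqrt(n)]."""
    lo, hi = 2.0 * sqrt(n), 4.0 * sqrt(n)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_lambda(n, mid) > log(t):
            lo = mid  # lambda is still above t: move towards longer patterns
        else:
            hi = mid
    return 0.5 * (lo + hi)


for n in (10**4, 10**6, 10**8):
    j_star = critical_length(n)
    # Columns: n, the solution j*, and the ratio j* / (e sqrt(n)).
    print(n, j_star, j_star / (e * sqrt(n)))
```

The ratio in the last column approaches 1 from below, consistent with a correction of logarithmic order to the leading term $e \sqrt{n}$.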


3.5. Consecutive pattern avoidance for Mallows permutations. In Section 5, we discuss
several special properties of the Mallows distribution that are helpful for studying consecutive
pattern avoidance. Using these properties, we obtain analogous bounds to those in
Theorems 3.4 and 3.6.

Recall the definition of the restriction $\Sigma_n|_A$ of $\Sigma_n$ to a subset $A \subset [n]$, and recall $\bar{\Gamma}_m$ denotes
the set of subsets of size $m$ whose elements are consecutive in $\{1, 2, \ldots, n\}$.

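Theorem 3.9. Fix $q > 0$ and let $\Sigma_n \sim$ Mallows($q$). For any $m \ge 2$, let $\tau_m$ be any pattern of size $m$.
For any $\alpha \in \bar{\Gamma}_m$, let

$X_\alpha := I(\mathrm{red}(\Sigma_n|_\alpha) = \tau_m)$.

Let $W = \sum_{\alpha \in \bar{\Gamma}_m} X_\alpha$, and let $Y$ be an independent Poisson random variable with expected value
$\lambda = E W$. Then

$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2 (b_1 + b_2)$,

where

$\lambda = (n - m) \frac{q^{|\mathrm{inv}(\tau_m)|}}{I_m(q)}$, $n_1 = 2mn - 3m^2 + m$,

$b_1 = n_1 \left( \frac{q^{|\mathrm{inv}(\tau_m)|}}{I_m(q)} \right)^2$,

$b_2 = \sum_{s=1}^{m-1} 2 (n - 2m + s) \sum_{\rho \in L_s(\tau_m)} \frac{q^{|\mathrm{inv}(\rho)|}}{I_{2m-s}(q)}$,

and the inner sum runs over the permutations $\rho \in S_{2m-s}$ counted by the sequential overlap of size $s$.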

The key to obtaining asymptotic formulas and Poisson limit theorems for general Mallows
permutations relies on the interplay between the parameters $n$, $m$, $|\mathrm{inv}(\tau)|$, and $q$. In
particular, we need the expected number of occurrences to converge to a constant $\lambda \in (0, \infty)$.
In the case of consecutive pattern avoidance, the expected number of occurrences of a
pattern $\tau \in S_m$ in $\Sigma_n \sim$ Mallows($q$) is

$\lambda = (n - m) \frac{q^{|\mathrm{inv}(\tau)|}}{I_m(q)}$,

which, for $m$ fixed, produces non-trivial limiting behavior as long as

$q \sim n^{-1/|\mathrm{inv}(\tau)|}$ or $q \sim n^{1/(\binom{m}{2} - |\mathrm{inv}(\tau)|)}$.

We can also allow $m$ to vary and keep $q$ fixed so that

$|\mathrm{inv}(\tau)| \sim -\log(n)/\log(q)$ or $|\mathrm{inv}(\tau)| \sim -\log(n)/\log(q) + m^2/2$.

Combined with Theorem 3.9, these observations yield Corollary 3.3.

In Section 5.3.2, we demonstrate Theorem 3.9 for all patterns of length 3. In Section [O}
we compute the bounds in Theorems 3.4 and 3.9 for the specific patterns 2341 and 23451
and we plot the estimated pattern avoidance probabilities in the appropriate asymptotic
regime for $q$ from Corollary 3.3.


4. Poisson approximation 

4.1. Chen-Stein method. Stein's method is an approach to proving the central limit theorem
that was adapted by Chen to Poisson convergence IH^. The advantage of this method
is that it provides guaranteed error bounds on the total variation distance between the
distribution of a sum of possibly dependent random variables and the distribution of an
independent Poisson random variable with the same mean.

Theorem 4.1 (Chen [12]). Suppose $X_1, X_2, \ldots, X_n$ are indicator random variables with expectations
$p_1, p_2, \ldots, p_n$, respectively, and let $W = \sum_{i=1}^{n} X_i$. Let $Y$ denote an independent Poisson random
variable with expectation $\lambda = \sum_{i=1}^{n} p_i$. Suppose, for each $i \ge 1$, a random variable $V_i$ can be
constructed on the same probability space as $W$ such that

$\mathcal{L}(1 + V_i) = \mathcal{L}(W \mid X_i = 1)$.

Then

(17) $d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le \frac{1 - e^{-\lambda}}{\lambda} \sum_{i=1}^{n} p_i\, E|W - V_i|$.
4.2. Fixed Points Example. As a simple example, let $e(n)$ denote the number of fixed-point
free permutations of $[n]$. With $\Sigma_n$ a uniform permutation of $[n]$, we define indicator random
variables

$X_i = I(i \text{ is a fixed point of } \Sigma_n)$, $i = 1, \ldots, n$.

(Note that these random variables are not independent.) We then define the sum

$W = \sum_{i=1}^{n} X_i$

so that $P(W = 0) = e(n)/n!$ and $\lambda = E W = \sum_{i=1}^{n} \frac{1}{n} = 1$, the expected number of fixed points.

Even before we proceed with the bound, we obtain the heuristic estimate of $n!\, e^{-1}$ for $e(n)$,
just as in (2).

To apply Theorem 4.1, we need to construct $W$ and $1 + V_i$ on the same probability space,
that is, construct an explicit coupling. This is done for more general restrictions in [5], but we
shall write out the full calculation on fixed points to demonstrate how one can construct
such a coupling.

The random variable $1 + V_i$ is the random sum $W$ conditioned on $X_i = 1$. For a random
permutation $\sigma$, suppose $\sigma(i) = j$, for some $j \in [n]$. The coupling is: swap elements $i$ and $j$.
The resulting permutation has the same marginal distribution as a random permutation
conditioned to have a fixed point at $i$. In fact, $|W - V_i| \in \{0, 1, 2\}$ for each $i$ since we modify at
most 2 elements, and the elements not involved in the swap cancel out (i.e., any fixed points
occurring on indices other than these swapping positions remain unchanged). Let us denote
the random variables after the coupling by $X'_1, X'_2, \ldots, X'_n$; that is, $\mathcal{L}(X'_j) = \mathcal{L}(X_j \mid X_i = 1)$, so
that $1 + V_i = \sum_{j=1}^{n} X'_j$. We have


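$|W - V_i| = \Big| X_i + \sum_{j \neq i} (X_j - X'_j) \Big| = \begin{cases} 0, & \sigma(i) \neq i,\ i \text{ not in a 2-cycle}, \\ 1, & \sigma(i) = i, \\ 2, & i \text{ in a 2-cycle}. \end{cases}$

The probability that two given elements $i$ and $j$ are part of a 2-cycle is precisely $1/(n(n - 1))$,
and the probability that $i$ is part of a 1-cycle is $1/n$. Thus,

$E|W - V_i| = \frac{1}{n} + \frac{2}{n} = \frac{3}{n}$

and Equation (17) becomes

$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le (1 - e^{-1}) \sum_{i=1}^{n} \frac{1}{n} \cdot \frac{3}{n} = \frac{3 (1 - e^{-1})}{n}$.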


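For all $n \ge 1$, we have

$|P(W = 0) - P(Y = 0)| = \left| \frac{e(n)}{n!} - e^{-1} \right| \le \sup_{i} |P(W = i) - P(Y = i)| \le d_{TV}(W, Y) \le \frac{3 (1 - e^{-1})}{n}$.

Rearranging yields

$n!\, e^{-1} - 3 (n - 1)!\, (1 - e^{-1}) \le e(n) \le n!\, e^{-1} + 3 (n - 1)!\, (1 - e^{-1})$.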


Note that this is a guaranteed error bound that holds for all $n \ge 1$, and as a corollary we get
$e(n) = n!\, e^{-1} (1 + O(n^{-1}))$.
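
The two-sided bound on $e(n)$ is easy to check numerically. The sketch below computes the number of derangements exactly via the standard recursion $D_n = (n-1)(D_{n-1} + D_{n-2})$ and compares it with the interval $n!\, e^{-1} \pm 3 (n-1)!\, (1 - e^{-1})$; the function names are hypothetical.

```python
from math import exp, factorial


def derangements(n):
    """e(n) = D_n via the recursion D_n = (n - 1)(D_{n-1} + D_{n-2})."""
    previous, current = 1, 0  # D_0 = 1, D_1 = 0
    if n == 0:
        return previous
    for k in range(2, n + 1):
        previous, current = current, (k - 1) * (current + previous)
    return current


def chen_stein_interval(n):
    """Interval n!/e -/+ 3 (n-1)! (1 - 1/e) from the fixed-point example."""
    centre = factorial(n) * exp(-1.0)
    radius = 3.0 * factorial(n - 1) * (1.0 - exp(-1.0))
    return centre - radius, centre + radius


for n in range(1, 11):
    lower, upper = chen_stein_interval(n)
    # Columns: n, e(n), whether the bound holds, and n!/e rounded to the nearest integer.
    print(n, derangements(n), lower <= derangements(n) <= upper,
          round(factorial(n) / exp(1.0)))
```

The interval is valid for every $n$ but far from tight: the last column already reproduces $e(n)$ exactly.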
The error bounds derived from the Chen-Stein method can be improved in special
cases, e.g., $e(n)$ above can be obtained exactly by rounding $n!/e$ to the nearest integer for
all $n \ge 1$, but the appeal of Poisson approximation is that it applies more generally. Our
main results (Corollaries 3.1 and 3.2) identify cases in which the bounds provide an
asymptotically efficient estimate.


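4.3. The Arratia-Goldstein-Gordon Theorem. Arratia, Goldstein, & Gordon [2] provide
another approach that is sometimes more practical for Poisson approximation.

Theorem 4.2 (Arratia, Goldstein, & Gordon [2]). Let $I$ be a countable set of indices and for each
$\alpha \in I$ let $X_\alpha$ be an indicator random variable. Let $X = (X_\alpha)_{\alpha \in I}$ denote a collection of Bernoulli random
variables, and let $B = (B_\alpha)_{\alpha \in I}$ denote a collection of independent Bernoulli random variables with
marginal distributions which satisfy $E B_\alpha = E X_\alpha$ for all $\alpha \in I$. Define $p_\alpha := E X_\alpha = P(X_\alpha = 1) > 0$
and $p_{\alpha\beta} := E X_\alpha X_\beta$. Also define $W := \sum_{\alpha \in I} X_\alpha$ and $\lambda := E W = \sum_{\alpha \in I} p_\alpha$. For each $\alpha \in I$, define
sets $D_\alpha \subset I$ (typically, this will be the set of all indices $\beta \in I$ for which $X_\alpha$ and $X_\beta$ are dependent, but
this is not necessary) and the quantities

     $b_1 := \sum_{\alpha \in I} \sum_{\beta \in D_\alpha} p_\alpha p_\beta$,

(18) $b_2 := \sum_{\alpha \in I} \sum_{\alpha \neq \beta \in D_\alpha} p_{\alpha\beta}$, and

     $b_3 := \sum_{\alpha \in I} E \left| E\{X_\alpha - p_\alpha \mid \sigma(X_\beta : \beta \notin D_\alpha)\} \right|$.

We have

$d_{TV}(\mathcal{L}(X), \mathcal{L}(B)) \le 2 (2 b_1 + 2 b_2 + b_3) + 2 \sum_{\alpha \in I} p_\alpha^2$.

Furthermore, let $Y$ denote a Poisson random variable with mean $\lambda = E W$. We have

$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2 (b_1 + b_2 + b_3)$,

and also

$|P(W = 0) - P(Y = 0)| \le (b_1 + b_2 + b_3) \frac{1 - e^{-\lambda}}{\lambda}$.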

In our applications, we are able to define sets $D_\alpha$, $\alpha \in I$, so that $b_3 = 0$ always holds;
hence, the bounds in our theorems require only calculations involving
first and second (unconditioned) moments. For uniform random permutations this is
straightforward, but it is less obvious that the analogous properties hold for consecutive
patterns under random Mallows permutations.


5. Consecutive pattern avoidance of Mallows permutations 

To fix notation, we write $\sigma = \sigma_1 \cdots \sigma_n$ to denote a generic permutation, for which we
define the reversal by $\sigma^r = \sigma_n \cdots \sigma_1$. For any subset $A \subset [n]$, we write $\sigma|_A$ to denote the
restriction of $\sigma$ to a permutation of $A$ obtained by removing those elements among $\sigma_1, \ldots, \sigma_n$
that are not in $A$. For example, with $\sigma = 867531924$ and $A = \{1, 3, 5, 7, 9\}$, we have $\sigma|_A = 75319$.
We write $\Sigma_n$ to denote a random permutation of $[n]$.

From (1) it is apparent that $P_n^q(\sigma) = P_n^{1/q}(\sigma^r)$ for all $\sigma \in S_n$, and so we can focus on the case
$0 < q \le 1$ in our analysis.
5.1. Sequential construction. The Mallows distribution (1) enjoys several nice properties
that are amenable to the study of pattern avoidance. These properties are readily observed
by the following sequential constructions, both of which are well known and have been
leveraged in previous studies of the Mallows distribution; see, for example, [6, 19]. While
the properties below are well known, we are not aware of their appearance in relation to
pattern avoidance. We provide proofs for completeness.

For $q > 0$, we say that random variable $X$ has the truncated Geometric($q$) distribution on
$[n]$, written as $X \sim$ Geometric($n, q$), when the point probabilities of $X$ are given by

(19) $P^{n,q}(X = k) = \frac{q^{k-1}}{1 + q + \cdots + q^{n-1}}$, $k = 1, \ldots, n$.

A Mallows permutation can be generated from the truncated Geometric distribution in two
ways, which we call the ordering and bumping constructions.

For the ordering construction, we generate $X_1, X_2, \ldots$ independently, with each $X_n$
distributed as Geometric($n, 1/q$). To initialize, we have $\Sigma_1 = 1$, the only permutation of $[1]$.
Given $\Sigma_n = \sigma_1 \cdots \sigma_n$ and $X_{n+1} = k$, we define

$\Sigma_{n+1} = \sigma_1 \cdots \sigma_{k-1} (n + 1) \sigma_k \cdots \sigma_n$.

For every $n = 1, 2, \ldots$, it is apparent that $\Sigma_n$ is a Mallows($q$) permutation because the
probability that element $n + 1$ appears in position $k$ of $\Sigma_{n+1}$ is

$P\{\Sigma_{n+1}(k) = n + 1\} = P^{n+1,1/q}(X = k) = P^{n+1,q}(X = n + 2 - k) = q^{n+1-k} / (1 + q + \cdots + q^n)$.

Since $X_1, \ldots, X_n$ are chosen independently and each event $\{\Sigma_n = \sigma\}$ corresponds to exactly
one sequence $X_1, \ldots, X_n$, we observe

$P\{\Sigma_n = \sigma\} = q^{|\mathrm{inv}(\sigma)|} / I_n(q)$, $\sigma \in S_n$,

as in (1).

Definition 5.1 (Mallows process). A collection $(\Sigma_n)_{n \ge 1}$ generated by the ordering construction
for fixed $q > 0$ is called a Mallows($q$) process.

For the bumping construction, we generate $X_1, X_2, \ldots$ independently with each $X_n$
distributed as Geometric($n, 1/q$) as before, and again we initialize with $\Sigma_1 = 1$. Given
$\Sigma_n = \sigma_1 \cdots \sigma_n$ and $X_{n+1} = k$, we obtain $\Sigma_{n+1}$ by appending $k$ to the end of $\Sigma_n$ and "bumping"
all elements of $\Sigma_n$ that are greater than or equal to $k$. More formally, $(\Sigma_n, X_{n+1}) \mapsto \Sigma'_1 \cdots \Sigma'_n X_{n+1}$,
where

$\Sigma'_j = \begin{cases} \Sigma_j + 1, & \Sigma_j \ge X_{n+1}, \\ \Sigma_j, & \text{otherwise}. \end{cases}$

For example, if $\Sigma_5 = 24135$ and $X_6 = 3$, then $\Sigma_6 = 251463$. Again, the resulting distribution
of $\Sigma_n$ is Mallows($q$) because $X_{n+1} = k$ introduces exactly $n + 1 - k$ new inversions in $\Sigma_{n+1}$
and $X_1, X_2, \ldots$ are generated independently.
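
Both constructions can be checked exactly for small $n$ by enumerating every sequence of geometric draws. The sketch below assumes the truncated geometric weights $P(X = k) \propto q^{k-1}$ on $[n]$ (so the draws with parameter $1/q$ have weights proportional to $q^{-(k-1)}$); all function names are hypothetical.

```python
from itertools import product


def ordering_step(sigma, k):
    """Ordering construction: insert the new largest element n + 1 at position k."""
    return sigma[:k - 1] + (len(sigma) + 1,) + sigma[k - 1:]


def bumping_step(sigma, k):
    """Bumping construction: raise every entry >= k by one, then append k."""
    return tuple(v + 1 if v >= k else v for v in sigma) + (k,)


def count_inversions(sigma):
    n = len(sigma)
    return sum(sigma[i] > sigma[j] for i in range(n) for j in range(i + 1, n))


def mallows_pmf(sigma, q):
    """q^|inv(sigma)| / I_n(q), with I_n(q) the product of 1 + q + ... + q^(j-1)."""
    norm = 1.0
    for j in range(1, len(sigma) + 1):
        norm *= sum(q ** i for i in range(j))
    return q ** count_inversions(sigma) / norm


def exact_law(n, q, step):
    """Law of Sigma_n when X_size ~ Geometric(size, 1/q) drives the given step."""
    r = 1.0 / q
    law = {}
    for xs in product(*(range(1, size + 1) for size in range(2, n + 1))):
        sigma, prob = (1,), 1.0
        for size, k in zip(range(2, n + 1), xs):
            prob *= r ** (k - 1) / sum(r ** i for i in range(size))
            sigma = step(sigma, k)
        law[sigma] = law.get(sigma, 0.0) + prob
    return law


print(bumping_step((2, 4, 1, 3, 5), 3))   # (2, 5, 1, 4, 6, 3)
print(ordering_step((2, 4, 1, 3, 5), 3))  # (2, 4, 6, 1, 3, 5)

for step in (ordering_step, bumping_step):
    law = exact_law(5, 0.6, step)
    print(step.__name__, max(abs(p - mallows_pmf(s, 0.6)) for s, p in law.items()))
```

In both cases the maximal discrepancy from the Mallows probabilities is at the level of floating-point rounding.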


5.2. Properties of Mallows permutations. Throughout this section, we let $(\Sigma_n)_{n \ge 1}$ be a
family of random permutations so that each $\Sigma_n$ is a permutation of $[n]$. We say that $(\Sigma_n)_{n \ge 1}$
is consistent if for all $1 \le m \le n$

(20) $P\{\Sigma_n|_{[m]} = \sigma\} = P\{\Sigma_m = \sigma\}$, $\sigma \in S_m$.

It is immediate from the ordering construction that the Mallows process $(\Sigma_n)_{n \ge 1}$ is consistent
for every $q > 0$.
Recall the reduction map described in Section 2. We call $(\Sigma_n)_{n \ge 1}$ homogeneous if for all
$1 \le m \le n$ and every subsequence $1 \le i_1 < \cdots < i_m \le n$

(21) $P\{\mathrm{red}(\Sigma_n(i_1) \cdots \Sigma_n(i_m)) = \sigma\} = P\{\Sigma_m = \sigma\}$, $\sigma \in S_m$.

We call $(\Sigma_n)_{n \ge 1}$ consecutively homogeneous if (21) holds only for consecutive subsequences
$i_1, i_1 + 1, \ldots, i_1 + m - 1$.

Lemma 5.2. The Mallows($q$) process is consecutively homogeneous for all $q > 0$ and homogeneous
for $q = 1$.

Proof. The $q = 1$ case corresponds to the uniform distribution, which is well known
to be homogeneous. For arbitrary $q > 0$, consider the event $\{\mathrm{red}(\Sigma_n(j) \cdots \Sigma_n(j + m - 1)) = \sigma\}$
for some $\sigma \in S_m$. By the ordering construction, we can first generate $\Sigma_m =
\Sigma_m(1) \cdots \Sigma_m(m)$ from the Mallows($1/q$) distribution on $[m]$. We then obtain $\Sigma_{m+j-1}$ from
$\Sigma_m$ using the bumping construction for the Mallows($1/q$) distribution. Thus, we have $\Sigma_{m+j-1} \sim$
Mallows($1/q$) and its reversal $\Sigma^r_{m+j-1} \sim$ Mallows($q$) with $\mathrm{red}(\Sigma^r_{m+j-1}(j) \cdots \Sigma^r_{m+j-1}(m + j - 1)) =
\Sigma_m(m) \cdots \Sigma_m(1) \sim$ Mallows($q$). Finally, we obtain $\Sigma_n$ by adding to $\Sigma^r_{m+j-1}$ according to the
bumping construction, so that

$\mathrm{red}(\Sigma_n(j) \cdots \Sigma_n(m + j - 1)) = \mathrm{red}(\Sigma^r_{m+j-1}(j) \cdots \Sigma^r_{m+j-1}(m + j - 1)) = \Sigma_m(m) \cdots \Sigma_m(1) \sim$ Mallows($q$).

This completes the proof. □

We say that $\Sigma_n$ is dissociated if $\Sigma_n|_A$ and $\Sigma_n|_B$ are independent for all non-overlapping
subsets $A, B \subset [n]$. If, instead, $\Sigma_n|_A$ and $\Sigma_n|_B$ are independent only when $A$ and $B$ are disjoint
and each consists of consecutive indices, then we call $\Sigma_n$ weakly dissociated.

Lemma 5.3. For all $n \ge 1$, the Mallows($q$) distribution on $S_n$ is weakly dissociated for all $q > 0$
and dissociated for $q = 1$.

Proof. For $i' > i \ge 1$ and $m, m' > 0$ satisfying $i + m - 1 < i'$ and $i' + m' - 1 \le n$, let
$A = \{i, i + 1, \ldots, i + m - 1\}$ and $B = \{i', i' + 1, \ldots, i' + m' - 1\}$. For any $n \ge 1$, we can
construct a Mallows($q$) permutation of $[n]$ by first generating $\Sigma_{i+m-1}$, for which we know
that $\mathrm{red}(\Sigma_{i+m-1}(i) \cdots \Sigma_{i+m-1}(i + m - 1)) \sim$ Mallows($q$) by Lemma 5.2. We then construct $\Sigma_n$
from $\Sigma_{i+m-1}$ by the bumping construction. Since bumping does not affect the reduction of
any part of $\Sigma_n(1) \cdots \Sigma_n(i + m - 1)$, we have

$P\{\mathrm{red}(\Sigma_n(i) \cdots \Sigma_n(i + m - 1)) = \sigma \mid \mathrm{red}(\Sigma_n(i') \cdots \Sigma_n(i' + m' - 1)) = \sigma'\}$
$= P\{\mathrm{red}(\Sigma_{i+m-1}(i) \cdots \Sigma_{i+m-1}(i + m - 1)) = \sigma \mid \mathrm{red}(\Sigma_n(i') \cdots \Sigma_n(i' + m' - 1)) = \sigma'\}$
$= P\{\Sigma_m(1) \cdots \Sigma_m(m) = \sigma\}$,

proving that $\Sigma_n$ is weakly dissociated. Dissociation of the uniform distribution ($q = 1$) is
well known and so we omit its proof. □

Together, the above properties facilitate study of consecutive pattern avoidance for
Mallows permutations with arbitrary $q > 0$. For example, the pattern 231 has probability
$q^2 / (1 + 2q + 2q^2 + q^3)$ to occur in any stretch of three consecutive positions of a Mallows($q$)
permutation. Since there are $n - 2$ consecutive patterns of length 3 in a permutation of
$[n]$, the expected number of occurrences is $(n - 2) q^2 / (1 + 2q + 2q^2 + q^3)$. For large $n$ and
small $q$, this expected value behaves asymptotically as $n q^2$, so that taking $q \sim 1/\sqrt{n}$ gives
an expected number on the order of a constant. When $q$ is large, the expected number of
occurrences behaves as $n q^{-1}$ for large $n$, and taking $q \sim n$ gives an expected number on the
order of a constant.
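
The window probability for 231 can be verified by direct enumeration. The sketch below computes, under the Mallows($q$) law on $S_6$, the exact probability that each window of three consecutive positions reduces to 231 and compares it with $q^2 / (1 + 2q + 2q^2 + q^3)$; the function names are hypothetical and the normalization is computed by summing weights rather than from a closed form.

```python
from itertools import permutations


def red(w):
    """Reduction of w: replace its i-th smallest entry by i."""
    rank = {v: i + 1 for i, v in enumerate(sorted(w))}
    return tuple(rank[v] for v in w)


def count_inversions(sigma):
    n = len(sigma)
    return sum(sigma[i] > sigma[j] for i in range(n) for j in range(i + 1, n))


def window_probability(n, q, tau, start):
    """P(red(Sigma_n(start) ... Sigma_n(start + m - 1)) = tau) under Mallows(q)."""
    m = len(tau)
    hit, total = 0.0, 0.0
    for sigma in permutations(range(1, n + 1)):
        weight = q ** count_inversions(sigma)
        total += weight
        if red(sigma[start - 1:start - 1 + m]) == tau:
            hit += weight
    return hit / total


q = 0.6
target = q ** 2 / (1 + 2 * q + 2 * q ** 2 + q ** 3)
for start in range(1, 5):
    print(start, window_probability(6, q, (2, 3, 1), start), target)
```

Every window gives the same value, as consecutive homogeneity (Lemma 5.2) predicts.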
5.3. Poisson convergence theorems. Theorem 3.9 and its corollary follow by combining
the above properties of Mallows permutations with Theorem 4.2. The calculations and


resulting bounds for the general Mallows measure follow the same program as the uniform 
case proven in Section]^ with the key distinction that we only consider consecutive patterns 
for the general Mallows distribution. Unlike the uniform setting, the bounds for the 
Mallows distribution depend non-trivially on the parameter q and the structure of t. It is 
more fruitful to illustrate this dependence with specific examples than to regurgitate the 
same proof for Mallows permutations. 


5.3.1. Monotonic patterns under Mallows distribution. Consider the set of permutations that
avoid the pattern 123. There are no inversions, and the size of the pattern is 3; thus, the
probability of seeing this pattern in any given set of three consecutive indices of a Mallows($q$)
permutation is $1/I_3(q)$. We also need to consider second moments, i.e., the probability of
seeing two 123 patterns. By Lemma 5.3 we need only consider overlapping sets of indices.
There are two cases: either two indices overlap or one does. If two indices overlap and the
first three and last three both reduce to pattern 123, then the segment must reduce to 1234.
Similarly, if one index overlaps, then the segment must reduce to 12345.

The results below extend this notion to monotonic patterns.


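Lemma 5.4. Fix $q > 0$ and let $\Sigma_n \sim$ Mallows($q$). For each $m \ge 1$, let $\tau_m$ denote the pattern $12\cdots m$.
For each $\alpha \in J_m$, define

$$X_\alpha = \mathbf{1}\{\mathrm{red}(\Sigma_n|_\alpha) = \tau_m\}.$$

For a random permutation generated using the Mallows measure, we have

$$\mathbb{E} X_\alpha = \frac{1}{I_m(q)}, \qquad \alpha \in J_m,$$

and for $\alpha, \beta \in J_m$, $\alpha \ne \beta$, we have

$$\mathbb{E} X_\alpha X_\beta = \begin{cases} \dfrac{1}{I_m(q)^2}, & \alpha, \beta \text{ have no overlapping elements}, \\[1ex] \dfrac{1}{I_{2m-s}(q)}, & \alpha, \beta \text{ have exactly } s \text{ overlapping elements},\ s = 1, 2, \ldots, m-1. \end{cases}$$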


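Proof. The expression when $\alpha$ and $\beta$ do not overlap is a consequence of the weak dissociation
property of Mallows permutations (Lemma 5.3), whereby

$$\mathbb{E} X_\alpha X_\beta = \mathbb{E} X_\alpha \, \mathbb{E} X_\beta = \mathbb{P}\{\mathrm{red}(\Sigma_n|_\alpha) = \tau_m\}^2 = (1/I_m(q))^2.$$

When $\alpha$ and $\beta$ overlap in $s$ elements, the event $\{X_\alpha = X_\beta = 1\}$ requires that both $\Sigma_n|_\alpha$ and
$\Sigma_n|_\beta$ reduce to the increasing permutation, which can occur only if $\Sigma_n|_{\alpha \cup \beta}$ reduces to the
increasing permutation of $2m - s$. □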

Proposition 5.5. Fix $q > 0$ and let $\Sigma_n \sim$ Mallows($q$). For any $m \ge 2$, let $\tau_m$ be the increasing
pattern $12\cdots m$. For any $\alpha \in J_m$, let

$$X_\alpha = \mathbf{1}\{\mathrm{red}(\Sigma_n|_\alpha) = \tau_m\}.$$

Let $W = \sum_{\alpha \in J_m} X_\alpha$ and let $Y$ be an independent Poisson random variable with expected value
$\lambda = \mathbb{E} W$. Then

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2(b_1 + b_2),$$

where

$$\lambda = \frac{n - m}{I_m(q)}, \qquad b_1 = \frac{n_1}{I_m(q)^2}, \qquad b_2 = n_2 \sum_{s=1}^{m-1} \frac{1}{I_{2m-s}(q)},$$

and $n_1$ and $n_2$ are given by

$$n_1 = 2mn - 3m^2 + m, \quad \text{and}$$
$$n_2 = 3m - 3m^2 - 2n + 2mn. \tag{22}$$

Proof. When $2m - 1 < n$, we have $n_1 = 2\sum_{s=1}^{m}(n - 2m + s) = 2mn - 3m^2 + m$, and similarly
$n_2 = 2\sum_{s=1}^{m-1}(n - 2m + s) = 3m - 3m^2 - 2n + 2mn$. The factor of 2 is from exchanging the roles
of $\alpha, \beta$. When $2m - 1 > n$, the stated expressions for $n_1$ and $n_2$ are still valid upper bounds,
but they can be slightly improved. □
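The quantities in Proposition 5.5 are simple to evaluate. The following sketch is not part of the original argument: it assumes the product form $I_n(q) = \prod_{i=1}^{n}(1 + q + \cdots + q^{i-1})$ and reads $b_2$ as $n_2 \sum_{s=1}^{m-1} 1/I_{2m-s}(q)$; the function names are hypothetical. It also checks the closed forms for $n_1$ and $n_2$ against the sums used in the proof.

```python
def mallows_normalizer(n, q):
    """I_n(q) = prod_{i=1}^{n} (1 + q + ... + q^(i-1)) (assumed product form)."""
    total = 1.0
    for i in range(1, n + 1):
        total *= sum(q ** k for k in range(i))
    return total


def overlap_counts(n, m):
    """Closed forms n_1 and n_2 from Proposition 5.5."""
    n1 = 2 * m * n - 3 * m ** 2 + m
    n2 = 3 * m - 3 * m ** 2 - 2 * n + 2 * m * n
    return n1, n2


def increasing_pattern_bound(n, m, q):
    """Return (lambda, 2 * (b_1 + b_2)) for the consecutive pattern 12...m."""
    n1, n2 = overlap_counts(n, m)
    lam = (n - m) / mallows_normalizer(m, q)
    b1 = n1 / mallows_normalizer(m, q) ** 2
    b2 = n2 * sum(1.0 / mallows_normalizer(2 * m - s, q) for s in range(1, m))
    return lam, 2.0 * (b1 + b2)


lam, tv_bound = increasing_pattern_bound(n=100, m=5, q=1.0)
print(lam, tv_bound)
```

With $q = 1$ the normalizer reduces to $m!$, so the call above corresponds to the uniform case.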

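In addition, it is easy to state the complementary result about the decreasing pattern
$m \cdots 21$.


Proposition 5.6. Fix $q > 0$ and let $\Sigma_n \sim$ Mallows($q$). For any $m \ge 2$, let $\eta_m$ be the decreasing
pattern $m \cdots 21$. For any $\alpha \in J_m$, let

$$X_\alpha = \mathbf{1}\{\mathrm{red}(\Sigma_n|_\alpha) = \eta_m\}.$$

Let $W = \sum_{\alpha \in J_m} X_\alpha$ and let $Y$ denote an independent Poisson random variable with expected value
$\lambda = \mathbb{E} W$. Then

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2(b_1 + b_2),$$

where

$$\lambda = (n - m)\,\frac{q^{\binom{m}{2}}}{I_m(q)}, \qquad b_1 = n_1 \left(\frac{q^{\binom{m}{2}}}{I_m(q)}\right)^{2}, \qquad b_2 = n_2 \sum_{s=1}^{m-1} \frac{q^{\binom{2m-s}{2}}}{I_{2m-s}(q)},$$

and $n_1$ and $n_2$ are given by

$$n_1 = 2mn - 3m^2 + m, \quad \text{and} \quad n_2 = 3m - 3m^2 - 2n + 2mn.$$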


5.3.2. Other patterns of length 3. We now demonstrate the dependence of the total variation
bound on $q$ for the small patterns 132, 213, 231, and 312. In this section, we again recall that
$J_m$ denotes the set of all $m$-tuples with consecutive elements in $\{1, 2, \ldots, n\}$, and for a given
$\alpha \in J_m$, $X_\alpha$ denotes the indicator random variable defined above.

For $\tau = 132$, we have $\mathbb{E} X_\alpha = q/I_3(q)$, and there can be no consecutive occurrences of $\tau$ that
overlap with two indices. The only possible ways in which we can have one overlapping
index are the patterns 13254, 15243, and 14253. In these cases, we have

$$\lambda = \frac{(n-3)\,q}{I_3(q)}$$

and

$$\mathbb{E} X_\alpha X_\beta = \frac{1}{I_5(q)} \times \begin{cases} q^2, & 13254, \\ q^3, & 14253, \\ q^4, & 15243. \end{cases}$$

Letting $W = \sum_{\alpha \in J_3} X_\alpha$ and defining $Y$ as an independent Poisson random variable with
expectation $\lambda = \mathbb{E} W = (n - 3)\,q/I_3(q)$, the total variation distance bound is given by

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2\left( (3n - 13)\,\frac{q^2}{I_3(q)^2} + 2(n - 5)\,\frac{q^2 + q^3 + q^4}{I_5(q)} \right).$$

The $2(n - 5)$ term comes from the 2 sets of triplets $\{1,2,3\}$ and $\{2,3,4\}$ for which the
overlapping pair can only occur to the right of the elements, and similarly from the 2 sets
of triplets $\{n - 2, n - 1, n\}$ and $\{n - 3, n - 2, n - 1\}$ for which the overlapping pair can only
occur to the left of the elements, and finally the $(n - 4 - 3)$ triplets in between for which
the overlapping pairs are both to the left and the right; hence $2 + 2(n - 7) + 2 = 2(n - 5)$.
Similarly, the $3n - 13$ comes from $2 \cdot 2 + 3(n - 7) + 2 \cdot 2$. Let $t > 0$ be fixed. If $q \sim t\,n^{-1}$ or
$q \sim t^{-1/2} n^{1/2}$, we have $\lambda \to t$, and $d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) = O(n^{-1})$.

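For $\tau = 213$, we similarly have

$$\lambda = \frac{(n-3)\,q}{I_3(q)}, \qquad \mathbb{E} X_\alpha X_\beta = \frac{1}{I_5(q)} \times \begin{cases} q^2, & 21435, \\ q^3, & 31425, \\ q^4, & 32415, \end{cases}$$

and

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2\left( (3n - 13)\,\frac{q^2}{I_3(q)^2} + 2(n - 5)\,\frac{q^2 + q^3 + q^4}{I_5(q)} \right),$$

and $q \sim t\,n^{-1}$ or $q \sim t^{-1/2} n^{1/2}$ implies $\lambda \to t$ and $d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) = O(n^{-1})$.

For $\tau = 231$: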


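$$\lambda = \frac{(n-3)\,q^2}{I_3(q)}, \qquad \mathbb{E} X_\alpha X_\beta = \frac{1}{I_5(q)} \times \begin{cases} q^6, & 34251, \\ q^7, & 35241, \\ q^8, & 45231, \end{cases}$$

and

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2\left( (3n - 13)\,\frac{q^4}{I_3(q)^2} + 2(n - 5)\,\frac{q^6 + q^7 + q^8}{I_5(q)} \right),$$

and $q \sim t^{1/2} n^{-1/2}$ or $q \sim n\,t^{-1}$ implies $\lambda \to t$ and $d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) = O(n^{-1})$.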

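And finally for $\tau = 312$:

$$\lambda = \frac{(n-3)\,q^2}{I_3(q)}, \qquad \mathbb{E} X_\alpha X_\beta = \frac{1}{I_5(q)} \times \begin{cases} q^6, & 51423, \\ q^7, & 52413, \\ q^8, & 53412, \end{cases}$$

and

$$d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) \le 2\left( (3n - 13)\,\frac{q^4}{I_3(q)^2} + 2(n - 5)\,\frac{q^6 + q^7 + q^8}{I_5(q)} \right),$$

and $q \sim t^{1/2} n^{-1/2}$ or $q \sim n\,t^{-1}$ implies $\lambda \to t$ and $d_{TV}(\mathcal{L}(W), \mathcal{L}(Y)) = O(n^{-1})$.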
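The overlapping patterns and their inversion counts used in this subsection can be confirmed by exhaustive enumeration. The sketch below is an illustrative check, not the authors' procedure; the helper names are hypothetical. For a pattern $\tau$ of length $m$ and a number of shared positions, it lists every permutation of length $2m - \text{shared}$ whose first $m$ and last $m$ entries both reduce to $\tau$.

```python
from itertools import permutations


def reduce_to_pattern(seq):
    """Return the pattern (ranks 1..k) that seq is order-isomorphic to."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def overlapping_occurrences(tau, shared):
    """Permutations whose first and last len(tau) entries both reduce to tau."""
    m = len(tau)
    length = 2 * m - shared
    return sorted(
        (perm, inversions(perm))
        for perm in permutations(range(1, length + 1))
        if reduce_to_pattern(perm[:m]) == tau
        and reduce_to_pattern(perm[length - m:]) == tau
    )


for tau in [(1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2)]:
    print(tau, overlapping_occurrences(tau, shared=1))
```

For each of the four patterns the enumeration with two shared positions is empty, which is why only the one-overlap case contributes.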


6. Numerical examples 


6.1. Numerical values. Using Theorem 3.4 we can estimate $|S_n(\tau)|$ for various sizes of
patterns $\tau$. Table 1 shows the lower bound thresholds for various values of $n$ and patterns

       n        j    lower                        n!
     100       36    6.85456 × 10^[illegible]     9.3326 × 10^157
    1000      133    3.4433 × 10^[illegible]      4.0239 × 10^2567
   10000      442    8.3847 × 10^35658            2.8463 × 10^35659
10000000    14353    9.9451 × 10^65657058         1.2024 × 10^65657059

Table 1. Bounds on $|S_n(\tau)|$ for $\tau \in S_j$, for various values of $n$ and $j$.


      n     j    lower                      n!
    100     6    3.98735 × 10^157           9.33262 × 10^157
   1000     7    5.77948 × 10^2566          4.02387 × 10^2567
  10000     9    2.49966 × 10^35659         2.84626 × 10^35659
 100000    10    2.48004 × 10^456573        2.82423 × 10^456573
1000000    11    7.34802 × 10^5565708       8.26393 × 10^5565708

Table 2. Bounds on $|S_n(\tau)|$ for $\tau \in S_j$, for various values of $n$ and $j$.


permutation   no. inversions   permutation   no. inversions
3452671        9               3462571       10
3472561       11               3562471       11
3572461       12               4562371       12
4572361       13               3672451       13
4672351       14               5672341       15

Table 3. List of all permutations that have pattern 2341 in overlapping
positions along with the number of inversions.


of size $j$. Similarly, using Theorem 3.6 we estimate $|S_n(\tau)|$ in Table 2. In the case of $n = 1000$
and $j = 133$, we have more specifically $3.4433 \times 10^{[\text{illegible}]} < |S_n(\tau)| < 4.0239 \times 10^{2567}$.
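The $n!$ column of the tables can be reproduced without big-integer arithmetic. The following sketch is only an illustration: it uses the standard-library function `math.lgamma` to obtain the mantissa and base-10 exponent of $n!$; the helper name `factorial_sci` is hypothetical.

```python
import math


def factorial_sci(n):
    """Return (mantissa, exponent) with n! ~= mantissa * 10**exponent."""
    log10_factorial = math.lgamma(n + 1) / math.log(10)
    exponent = math.floor(log10_factorial)
    mantissa = 10 ** (log10_factorial - exponent)
    return mantissa, exponent


for n in (100, 1000, 10000):
    mantissa, exponent = factorial_sci(n)
    print(f"{n}! ~ {mantissa:.4f} x 10^{exponent}")
```

Any upper bound of the form $|S_n(\tau)| \le n!$ can be compared against these values directly.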


6.2. Detailed illustration for the patterns 2341 and 23451. Propositions 5.5 and 5.6 give an
expression for the total variation bound between the number of occurrences of the increasing
and decreasing patterns and an independent Poisson random variable. In principle, these
bounds can be computed exactly for any pattern by way of the Arratia-Goldstein-Gordon
theorem (Theorem 4.2); we need only compute the quantities $b_1$, $b_2$, and $b_3$ as in Theorem 4.2.

By Lemma 5.3, all Mallows($q$) permutations are weakly dissociated and, therefore, $b_3 = 0$
for all patterns in the case of consecutive pattern avoidance. For any pattern $\tau$ of length $m$, homogeneity
of the Mallows measure implies $p_\alpha = q^{|\mathrm{inv}(\tau)|}/I_m(q)$ for all $\alpha$, and so $b_1$ is easy to compute. The
only complication involves the consideration of overlapping patterns in the calculation of
$b_2$. We cannot provide anything more general than Arratia-Goldstein-Gordon for arbitrary
patterns; instead, we compute these bounds in the special cases of $\tau = 2341$ and $\tau = 23451$.
Figure 1 shows the performance of these bounds at the critical values $q \sim n^{-1/3}$ and $q \sim n^{1/3}$
for $\tau = 2341$, and $q \sim n^{-1/4}$ and $q \sim n^{1/6}$ for $\tau = 23451$.


[Figure 1 panels: 2341-avoidance probability (small q); 23451-avoidance probability (small q); 2341-avoidance probability (large q); 23451-avoidance probability (large q).]

Figure 1. Plot of pattern avoidance probabilities of Mallows($q$) distribution
for: (top left) pattern $\tau = 2341$ with $q = n^{-1/3}$; (top right) pattern $\tau = 23451$
with $q = n^{-1/4}$; (bottom left) pattern $\tau = 2341$ with $q = n^{1/3}$; and (bottom
right) pattern $\tau = 23451$ with $q = n^{1/6}$. The dashed lines represent the upper
and lower error bounds from the Arratia-Goldstein-Gordon theorem, and
the solid line represents their average, i.e., the heuristic approximation. In all
panels, the horizontal axis is on the logarithmic scale with base 10.

[Figure 2 panels: 2341-avoidance probability (q=1); 23451-avoidance probability (q=1). Vertical axis: probability.]

Figure 2. Plot of lower and upper bounds on pattern avoidance probabilities
of uniform distribution ($q = 1$) for: (left) pattern $\tau = 2341$; (right) pattern
$\tau = 23451$.


6.2.1. The pattern 2341. For $\tau = 2341$, we have $p_\alpha = q^3/I_4(q)$ and $\lambda = (n - 4)q^3/I_4(q)$. The
structure of $\tau$ only permits overlap with the first or last position. Table 3 lists all permutations
that have pattern 2341 in the first 4 and last 4 positions. These are the only permutations that
contribute to $b_2$ in the bound of Theorem 4.2. We assume $n > 7$. For positions $5, \ldots, n - 5$,
each of these overlapping patterns can occur twice; otherwise, the patterns occur only once,
for a total of $2(n - 8) + 8 = 2n - 8$ possibilities. There are $6(n - 8) + 2(5 + 4 + 3) = 6n - 24$
overlapping patterns $\alpha$ and $\beta$ that contribute to $b_1$. Thus,

$$\lambda = (n - 4)\,q^3/I_4(q),$$
$$b_1 = (6n - 24)\,q^6/I_4(q)^2, \quad \text{and}$$
$$b_2 = (2n - 8)\,q^9\,(1 + q + 2q^2 + 2q^3 + 2q^4 + q^5 + q^6)/I_7(q),$$


permutation   no. inversions   permutation   no. inversions   permutation   no. inversions
345627891     12               347825691     16               456923781     18
345726891     13               347925681     17               467823591     19
345826791     14               348925671     18               467923581     20
345926781     15               357824691     17               468923571     21
346725891     14               357924681     18               567823491     20
346825791     15               358924671     19               567923481     21
346924781     16               367824591     18               568923471     22
356724891     15               367924581     19               378924561     21
356824791     16               368924571     20               478923561     22
356924781     17               457823691     18               578923461     23
456723891     16               457923681     19               678923451     24
456823791     17               458923671     20

Table 4. List of all permutations that have pattern 23451 in overlapping
positions along with the number of inversions.


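producing the bounds

$$e^{-\lambda} - (b_1 + b_2)\,\frac{1 - e^{-\lambda}}{\lambda} \;\le\; \mathbb{P}(W = 0) \;\le\; e^{-\lambda} + (b_1 + b_2)\,\frac{1 - e^{-\lambda}}{\lambda}.$$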
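As a rough numerical companion to the bounds for $\tau = 2341$, the sketch below evaluates $\lambda$, $b_1$, $b_2$ and the resulting interval for $\mathbb{P}(W = 0)$. It is a sketch under stated assumptions: the product form $I_n(q) = \prod_{i=1}^{n}(1 + q + \cdots + q^{i-1})$ is assumed, the error term is taken as $(b_1 + b_2)(1 - e^{-\lambda})/\lambda$, and clipping the interval to $[0, 1]$ is an added convenience rather than part of the stated bound. The function names are hypothetical.

```python
import math


def mallows_normalizer(n, q):
    """I_n(q) = prod_{i=1}^{n} (1 + q + ... + q^(i-1)) (assumed product form)."""
    total = 1.0
    for i in range(1, n + 1):
        total *= sum(q ** k for k in range(i))
    return total


def avoidance_bounds_2341(n, q):
    """Return (lambda, lower, upper) for P(W = 0) with consecutive pattern 2341."""
    lam = (n - 4) * q ** 3 / mallows_normalizer(4, q)
    b1 = (6 * n - 24) * q ** 6 / mallows_normalizer(4, q) ** 2
    overlap_poly = 1 + q + 2 * q ** 2 + 2 * q ** 3 + 2 * q ** 4 + q ** 5 + q ** 6
    b2 = (2 * n - 8) * q ** 9 * overlap_poly / mallows_normalizer(7, q)
    error = (b1 + b2) * (1 - math.exp(-lam)) / lam
    centre = math.exp(-lam)
    # Clipping to [0, 1] is an assumption added here for readability.
    return lam, max(0.0, centre - error), min(1.0, centre + error)


lam, lower, upper = avoidance_bounds_2341(n=1000, q=1000 ** (-1 / 3))
print(lam, lower, upper)
```

At $n = 1000$ and $q = n^{-1/3}$ the interval is narrow, which matches the regime in which $\lambda$ stays of constant order.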


6.2.2. The pattern 23451. For $\tau = 23451$, we have $p_\alpha = q^4/I_5(q)$ and $\lambda = (n - 5)q^4/I_5(q)$.
The structure of $\tau$ only permits overlap with the first or last position. Table 4 lists all
permutations that have pattern 23451 in the first 5 and last 5 positions. These are the only
permutations that contribute to $b_2$ in the bound of Theorem 4.2. We assume $n > 9$. For
positions $5, \ldots, n - 5$, each of these overlapping patterns can occur twice; otherwise, the
patterns occur only once, for a total of $2(n - 10) + 10 = 2n - 10$ possibilities. There are
$8(n - 10) + 2(7 + 6 + 5 + 4) = 8n - 36$ overlapping patterns $\alpha$ and $\beta$ that contribute to $b_1$. Thus,

$$\lambda = (n - 5)\,q^4/I_5(q),$$
$$b_1 = (8n - 36)\,q^8/I_5(q)^2, \quad \text{and}$$
$$b_2 = (2n - 10)\,q^{12}\,(1 + q + 2q^2 + 3q^3 + 4q^4 + 4q^5 + 5q^6 + 4q^7 + 4q^8 + 3q^9 + 2q^{10} + q^{11} + q^{12})/I_9(q),$$

producing the bounds

$$e^{-\lambda} - (b_1 + b_2)\,\frac{1 - e^{-\lambda}}{\lambda} \;\le\; \mathbb{P}(W = 0) \;\le\; e^{-\lambda} + (b_1 + b_2)\,\frac{1 - e^{-\lambda}}{\lambda}.$$
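The coefficients of the polynomial appearing in $b_2$ count the overlapping permutations by number of inversions, so they can be recovered by brute force. The sketch below is illustrative only (helper names are hypothetical): it enumerates all permutations of length $2m - 1$ whose first $m$ and last $m$ entries both reduce to $\tau$ and tallies their inversion counts. For $\tau = 23451$ this runs over all $9!$ permutations, which may take a few seconds.

```python
from collections import Counter
from itertools import permutations


def reduce_to_pattern(seq):
    """Return the pattern (ranks 1..k) that seq is order-isomorphic to."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def inversions(perm):
    """Number of pairs i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def overlap_inversion_histogram(tau):
    """Map inversion count -> number of one-overlap permutations for tau."""
    m = len(tau)
    histogram = Counter()
    for perm in permutations(range(1, 2 * m)):
        if reduce_to_pattern(perm[:m]) != tau:
            continue
        if reduce_to_pattern(perm[m - 1:]) == tau:
            histogram[inversions(perm)] += 1
    return dict(sorted(histogram.items()))


hist_2341 = overlap_inversion_histogram((2, 3, 4, 1))
hist_23451 = overlap_inversion_histogram((2, 3, 4, 5, 1))
print(hist_2341)
print(hist_23451)
```

The two histograms have 10 and 35 entries in total, matching the number of rows in the corresponding lists of overlapping permutations.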


7. Proofs 

7.1. Bounds on $p_\alpha$ and $p_{\alpha\beta}$. We first prove several lemmas, from which the theorems follow.
By the homogeneity property of uniform permutations we have

$$p_\alpha = \mathbb{E} X_\alpha = 1/j! \quad \text{for all } \alpha \in \Gamma_j.$$

To calculate the Poisson rate $\lambda$, we use linearity of expectation: there are $\binom{n}{j}$ possible $j$-tuples
of elements in $[n]$, and so the expected number of subsets of $j$ elements that reduce to the
pattern $k_1 k_2 \cdots k_j$ is

$$\lambda = \sum_{\alpha \in \Gamma_j} p_\alpha = \binom{n}{j} \Big/ j!.$$
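The identity $\lambda = \binom{n}{j}/j!$ can be sanity-checked for small $n$ by averaging the number of (not necessarily consecutive) occurrences over all permutations. The following sketch is a brute-force illustration with hypothetical helper names; it uses exact rational arithmetic so that the comparison is an equality.

```python
from fractions import Fraction
from itertools import combinations, permutations
from math import comb, factorial


def reduce_to_pattern(seq):
    """Return the pattern (ranks 1..k) that seq is order-isomorphic to."""
    ranks = {value: rank for rank, value in enumerate(sorted(seq), start=1)}
    return tuple(ranks[value] for value in seq)


def count_occurrences(perm, tau):
    """Number of index subsets of perm whose entries reduce to tau."""
    j = len(tau)
    return sum(
        1
        for positions in combinations(range(len(perm)), j)
        if reduce_to_pattern([perm[i] for i in positions]) == tau
    )


def expected_occurrences(n, tau):
    """Exact expected number of occurrences of tau in a uniform permutation of [n]."""
    total = sum(count_occurrences(perm, tau) for perm in permutations(range(1, n + 1)))
    return Fraction(total, factorial(n))


print(expected_occurrences(5, (1, 3, 2)), Fraction(comb(5, 3), factorial(3)))
```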

Next, we consider the joint expectation $p_{\alpha\beta} = \mathbb{E} X_\alpha X_\beta$.

Lemma 7.1. Fix $\alpha, \beta \in \Gamma_j$ and let $s = 1, \ldots, j - 1$ denote the number of elements that $\alpha, \beta$ have in
common. For any such pair $\alpha, \beta$, we have

$$p_{\alpha\beta} \le \frac{s!}{(j!)^2}. \tag{23}$$

Proof. First we condition on $X_\alpha$, which contributes a factor of $1/j!$. By conditioning on $X_\alpha$,
we assume that the $s$ common elements are in their proper order with respect to $X_\beta$. It may
so happen that, conditional on $X_\alpha$, no such event can occur, which justifies the inequality.

Consider first $s = j - 1$, i.e., condition on $j - 1$ of the entries being in their proper order.
Assuming that it is possible to realize both events simultaneously, the remaining element
has probability $1/j$ of appearing in its correct order.

Consider, for general $s$, conditional on $s$ entries being in their proper order, the probability
that the remaining $j - s$ elements appear in their proper order is then $((s + 1)(s + 2) \cdots j)^{-1}$. □


7.2. Proof of Theorem 3.4. We have the following lemma.


Lemma 7.2. For $D_\alpha$ defined as in Section 3.1 and $b_3$ as in Theorem 4.2, we have $b_3 = 0$ for all
patterns $\tau$.


Proof. This follows from the dissociated property of uniform permutations (Lemma 5.3).
We interpret the conditioning on $\sigma(X_\beta : \beta \notin D_\alpha)$ as the $\sigma$-algebra containing all possible
information about just the order of a particular set of three elements. Since these three
elements do not overlap any of the elements in $\alpha$, knowing only their order does not affect
$X_\alpha$ because uniform permutations are dissociated. □


Remark 7.3. Note that the conditioning in the expression for $b_3$ is not the $\sigma$-algebra containing
all information about the elements indexed by each tuple. If it were, then knowing their particular
location would have an impact. However, simply knowing their order does not reveal any more
information about $X_\alpha$.


Next we obtain the size of $D_\alpha$, which is the same for all $\alpha \in \Gamma_j$.

Lemma 7.4. For each $\alpha \in \Gamma_j$,

$$|D_\alpha| = \sum_{s=1}^{j} \binom{n-j}{j-s}\binom{j}{s}.$$

Proof. Fix any $\alpha, \beta \in \Gamma_j$ and let $s = 1, \ldots, j$ denote the number of elements that $\alpha, \beta$ have in
common. (This includes the case $\alpha = \beta$.) For each $s = 1, 2, \ldots, j$, we select any $s$ elements
out of the $j$ for the two sets of indices $\alpha, \beta$ to have in common, then we select the remaining
$j - s$ elements from the $n - j$ remaining elements that are not in $\alpha$. That is,

$$|D_\alpha| = \sum_{s=1}^{j} \binom{n-j}{j-s}\binom{j}{s}.$$

□


Lemma 7.5. We have

$$\sum_{\beta \in D_\alpha,\ \beta \ne \alpha} p_{\alpha\beta} \le \sum_{s=1}^{j-1} \binom{n-j}{j-s}\binom{j}{s}\frac{s!}{(j!)^2}$$

for all $\alpha \in \Gamma_j$.


Proof. Follows immediately from Lemma 7.4 and Lemma 7.1. □
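The counting argument for $|D_\alpha|$ in Lemma 7.4 can be verified directly for small parameters. The sketch below is an illustrative check with hypothetical function names: it compares the sum $\sum_{s=1}^{j}\binom{n-j}{j-s}\binom{j}{s}$ with a brute-force count of the $j$-subsets of $[n]$ that share at least one element with a fixed $\alpha$ (including $\alpha$ itself).

```python
from itertools import combinations
from math import comb


def neighbourhood_size_formula(n, j):
    """Sum over s = 1..j of C(n - j, j - s) * C(j, s)."""
    return sum(comb(n - j, j - s) * comb(j, s) for s in range(1, j + 1))


def neighbourhood_size_brute(n, j):
    """Count j-subsets of [n] that intersect alpha = {1, ..., j}."""
    alpha = set(range(1, j + 1))
    return sum(1 for beta in combinations(range(1, n + 1), j) if alpha & set(beta))


for n, j in [(8, 3), (10, 4), (7, 5)]:
    print(n, j, neighbourhood_size_formula(n, j), neighbourhood_size_brute(n, j))
```

By Vandermonde's identity the same count equals $\binom{n}{j} - \binom{n-j}{j}$, which the tests below also confirm.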

The expression given for $d_2$ in Equation (10) in the statement of Theorem 3.4 is straightforward,
although it contains the overlap quantities $L_s(\tau)$, which can vary wildly for different
patterns $\tau$, and for which we are unaware of any general explicit or asymptotic expression.
We calculate explicit upper bounds for $d_2$ in Lemma 7.7.


7.3. Proof of Theorem 3.6 and Theorem 3.9. Theorem 3.6 follows from Theorem 3.9 using
$q = 1$ and the fact that $I_n(1) = n!$. Theorem 3.9 is a straightforward generalization of
Proposition 5.5.


7.4. Proof of corollaries. For the proof of Corollary 3.1 it is sufficient that the bounds for
$d_1$ and $d_2$ in Theorem 3.4 converge to 0 as $n \to \infty$.


Lemma 7.6. Suppose $j > (e + \epsilon)\sqrt{n}$. Then for $d_1$ as in Theorem 3.4 we have $d_1 \to 0$.


Proof. The proof is an elementary application of Stirling's formula. □


For $d_2$, the asymptotic analysis is not so straightforward, which is why we instead use
the inequality in Equation (23).

Lemma 7.7. We have 


J \J 


s! < 


e^(i2 u) Ie^{n - i)\^ 1 


{2nf 


- e 
} 




ogO')- 


Whence, for any rj >0, taking j > (ee^^^ -l- p) yfn, we have d 2 0. 


Proof. We count the number of pairs $(\alpha, \beta)$, $\alpha \in \Gamma_j$ and $\beta \in D_\alpha$, with exactly $s$ shared elements.
We may first choose any $2j - s$ locations among the $n$ possible choices for the patterns to
occur on. Of those $2j - s$ locations, we can choose any $j$ of them for the elements of $\alpha$. Then,
of those $j$ locations, any $s$ can also be shared with $\beta$. Thus, for a given $s \in \{1, 2, \ldots, j - 1\}$,
there are

$$\binom{n}{2j-s}\binom{2j-s}{j}\binom{j}{s}$$

terms in the sum. Using Equation (23), we have






n-ni 




s! 




]■ 1 \; -s/w 

In order to handle the sum, first we recall the quantitative bounds of Robbins [35], i.e.,

$$e^{\frac{1}{12n+1}} < \frac{n!}{(n/e)^n \sqrt{2\pi n}} < e^{\frac{1}{12n}} \qquad \text{for all } n \ge 1.$$
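Robbins' two-sided bound is easy to confirm numerically for moderate $n$. The sketch below is an illustration only: it works on the logarithmic scale with `math.lgamma` and checks $\frac{1}{12n+1} < \log n! - \log\big((n/e)^n\sqrt{2\pi n}\big) < \frac{1}{12n}$. The range is kept small because, for very large $n$, the gap between the two bounds falls below double-precision resolution.

```python
import math


def stirling_log_remainder(n):
    """log(n!) minus log((n/e)^n * sqrt(2*pi*n))."""
    stirling = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.lgamma(n + 1) - stirling


def robbins_bounds_hold(n):
    """True if 1/(12n+1) < remainder < 1/(12n)."""
    remainder = stirling_log_remainder(n)
    return 1.0 / (12 * n + 1) < remainder < 1.0 / (12 * n)


print(all(robbins_bounds_hold(n) for n in range(1, 51)))
```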


Again we emphasize that this inequality holds for all $n \ge 1$, which allows us to provide the
simpler bound of


/ 2/ Avi-s 

- / 1 — i»\-' ^ n-] 


e (n - j) 


The term 


Ie^{n-j) y ® 

\ (Hf ) 


is maximized when j - s = ^Jn - j, whence 


* - 4“p((]I - - 2S“p((u - 4F'^>°sO). 


Note next that 


S = 1 


A < 


2 \i -— 
e n\ e « 


so that for j ~ {e + e) yfn, we have 


A < 


('u) 


;2 j Ini’ 

Ij Q-(e+ef 

In j 


and so 


d2 <((i + £)(e-)) 




2 ^/ \l + f/ 27 z; 

Letting e = -!) + ?] for any j] > 0, we conclude that taking j > + ?]) ^fn implies 

d2^0. □ 

We now compute an upper bound on $d_2$ which is explicit and establishes Corollary 3.2.

Lemma 7.8. 

m-l , 

n , r. s s! n 1 

di < / {n - 2m + s)- 

S=1 


(?w)!2 m\ m 

Taking m > T^~^\n/t) - 1 and m ~ T^~^\n/t) - 1, we have d2 < ^ —> 0 as n tends to infinity. 
Proof. It is easy to see that 

^ =^(l + 0(m-i)), 

m' V A 


m-l 

L 


S=1 


(m + s-1)! m\ 
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and also 

m-l ^ 

y 7 — 

{m +s - ly. m\^ ' 

Using Equation the result immediately follows. □ 

For a more explicit form of the growth of $m$, we define $\psi(x) = \Gamma'(x)/\Gamma(x)$, the digamma
function, as the logarithmic derivative of the gamma function, and denote by $k_0$ the smallest
positive root of $\psi(x)$, i.e., $k_0 = 1.46163\ldots$. Also, let $c = e^{-1}\sqrt{2\pi} - \Gamma(k_0) = 0.036534\ldots$, and
denote by $W(x)$ the Lambert $W$ function, i.e., the solution to $x = W(x)e^{W(x)}$. Finally, let
$L(x) := \log((x + c)/\sqrt{2\pi})$.

Lemma 7.9 ( KTll l. As x tends to infinity, we have 

$$\Gamma^{(-1)}(x) \sim \frac{L(x)}{W(L(x)/e)} + \frac{1}{2} \sim \frac{\log(x)}{\log\log(x) - \log\log\log(x)}.$$
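The approximation to the inverse gamma function can be tried out numerically. The sketch below is illustrative and rests on assumptions: it takes $c = \sqrt{2\pi}/e - \Gamma(k_0)$ and $L(x) = \log((x + c)/\sqrt{2\pi})$, uses the numerical value $k_0 \approx 1.4616321449683623$, and computes the Lambert $W$ function by a plain Newton iteration (the helper `lambert_w` is hypothetical, valid here for non-negative arguments only).

```python
import math

K0 = 1.4616321449683623  # positive root of the digamma function (assumed value)
C = math.sqrt(2 * math.pi) / math.e - math.gamma(K0)


def lambert_w(y, iterations=50):
    """Principal branch of Lambert W for y >= 0, via Newton's method."""
    w = math.log1p(y)  # starts at or to the right of the root for y >= 0
    for _ in range(iterations):
        exp_w = math.exp(w)
        w -= (w * exp_w - y) / (exp_w * (w + 1.0))
    return w


def inverse_gamma_approx(x):
    """Approximate solution m of Gamma(m) = x, intended for x >= 3."""
    log_term = math.log((x + C) / math.sqrt(2 * math.pi))
    return log_term / lambert_w(log_term / math.e) + 0.5


for target in (6.0, 10.0, 20.0):
    print(target, inverse_gamma_approx(math.gamma(target)))
```

In this sketch the recovered arguments agree with the targets to within a few thousandths, which is more than enough precision for choosing an integer pattern length $m$.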


Lemma 7.10. Suppose $t > 0$ is some fixed constant and $m = \lceil \Gamma^{(-1)}(n/t) - 1 \rceil$. Then


A = 


n-m 


ml 


Remark 7.11. We must be slightly careful when specifying the length of the pattern $m$ in Lemma 7.10,
since in general $\Gamma^{(-1)}(n/t) - 1$ will not be an integer. However, as long as $m$ always exceeds this value,
which we have ensured by setting it equal to the smallest integer exceeding it, then the asymptotic
expressions still hold.